Normalization for Automated Metrics: English and Arabic Speech Translation
Authors
Abstract
The Defense Advanced Research Projects Agency (DARPA) Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program has experimented with applying automated metrics to speech translation dialogues. For translations into English, BLEU, TER, and METEOR scores correlate well with human judgments, but scores for translation into Arabic correlate with human judgments less strongly. This paper provides evidence to support the hypothesis that automated scores for Arabic are lower because of variation and inflection in Arabic. It does so by demonstrating that normalization operations improve the correlation between BLEU scores and Likert-type judgments of semantic adequacy, as well as between BLEU scores and human judgments of the successful transfer of the meaning of individual content words from English to Arabic.
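As a rough illustration of why normalization can matter for n-gram metrics, the sketch below applies a few commonly used Arabic orthographic normalizations before computing a clipped unigram precision, the simplest building block of BLEU. The abstract does not list the paper's actual normalization operations, so the rules here (removing short-vowel diacritics and tatweel, unifying hamzated alef forms) and the function names `normalize_arabic` and `unigram_precision` are assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical normalization rules; the paper's own operations may differ.
_DIACRITICS = re.compile(r"[\u064B-\u0652]")          # fathatan through sukun
_TATWEEL = "\u0640"                                   # elongation character
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda / hamza
_BARE_ALEF = "\u0627"


def normalize_arabic(text: str) -> str:
    """Strip diacritics and tatweel, unify alef variants, collapse spaces."""
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    text = _ALEF_VARIANTS.sub(_BARE_ALEF, text)
    return " ".join(text.split())


def unigram_precision(hypothesis: str, reference: str) -> float:
    """Clipped unigram precision of a hypothesis against one reference."""
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.split())
    matched = sum(
        min(count, ref_counts[token])
        for token, count in Counter(hyp_tokens).items()
    )
    return matched / len(hyp_tokens)


# Hypothesis is fully vowelized and uses a hamzated alef;
# the reference is unvowelized and uses a bare alef.
hypothesis = "\u0643\u064E\u062A\u064E\u0628\u064E \u0623\u062D\u0645\u062F"
reference = "\u0643\u062A\u0628 \u0627\u062D\u0645\u062F"

raw_score = unigram_precision(hypothesis, reference)
normalized_score = unigram_precision(
    normalize_arabic(hypothesis), normalize_arabic(reference)
)
```

In this toy pair, the two strings differ only in spelling conventions, so no token matches before normalization and every token matches afterward. A real evaluation would use a full BLEU implementation and the normalization rules the authors actually specify.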
Similar resources
Applying Automated Metrics to Speech Translation Dialogs
Over the past five years, the Defense Advanced Research Projects Agency (DARPA) has funded development of speech translation systems for tactical applications. A key component of the research program has been extensive system evaluation, with dual objectives of assessing progress overall and comparing among systems. This paper describes the methods used to obtain BLEU, TER, and METEOR scores fo...
The Reality of Arabic Fiction Translation into English: A Sociological Approach
English translations of texts associated with Arabic fiction remain largely unexplored from a sociological perspective. Drawing on Pierre Bourdieu’s sociology, this paper aims to examine the genesis of Arabic fiction translation into English as a socially situated activity. Works of Arabic fiction emerged in English translation in the early twentieth century. Since then, this intellectual field...
Improvements in machine translation for English/Iraqi speech translation
In this paper, we describe techniques for improving machine translation quality in the context of speech-to-speech translation for significantly different language pairs. Specifically, we explore three broad approaches for improving translation from English to Iraqi and vice versa. First, we investigate normalization techniques which address the differences in spoken and written forms of both l...
Statistical vowelization of Arabic text for speech synthesis in speech-to-speech translation systems
Vowelization presents a principal difficulty in building text-to-speech synthesizers for speech-to-speech translation systems. In this paper, a novel log-linear modeling method is proposed that takes into account vowel and diacritical information at both the word level and character level. A unique syllable-based normalization algorithm is then introduced to enhance both word coverage and data c...
‘Repetition’ in Arabic-English Translation: The case of Adrift on the Nile
This study investigates ‘repetition’ in the English translation of the Arabic novel Adrift on the Nile (1993). It aims to explore the communicative functions of ‘repetition’ and to see whether these functions have been maintained or lost in the process of translating the novel. In addition, it seeks to identify the translation strategies used in rendering ‘repetition’. To achieve this aim, a d...